Query Optimization
A slow query in production doesn't just slow down one user — it holds database connections, blocks other queries, and can cascade into an outage. Understanding how the database executes queries and where to intervene is a core senior engineering skill.
How the Query Optimizer Works
When you run a SQL query, the database doesn't execute it literally. The query optimizer:
- Parses the SQL into a query tree
- Generates multiple equivalent execution plans
- Estimates the cost of each plan (using table statistics)
- Picks the cheapest plan and executes it
The optimizer uses statistics (row counts, column cardinality, histogram of values) to estimate costs. Stale statistics lead to bad plans.
-- PostgreSQL: update statistics manually
ANALYZE orders;
-- MySQL: update statistics
ANALYZE TABLE orders;
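To see the statistics the planner is actually working from, PostgreSQL exposes them in the pg_stats view (PostgreSQL-specific; MySQL keeps similar data in information_schema and `SHOW INDEX`):

```sql
-- Per-column statistics the planner uses for orders
SELECT attname, n_distinct, null_frac, most_common_vals
FROM pg_stats
WHERE tablename = 'orders';
```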
EXPLAIN — Reading Execution Plans
EXPLAIN shows the estimated plan without running the query. EXPLAIN ANALYZE actually executes it and reports real row counts and timings; for INSERT/UPDATE/DELETE, wrap it in a transaction you roll back.
EXPLAIN ANALYZE
SELECT o.id, u.name, o.total
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.status = 'PENDING'
AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC
LIMIT 20;
Limit  (cost=1234.56..1234.61 rows=20)
  ->  Sort  (cost=1234.56..1235.06 rows=200 width=72)
        Sort Key: o.created_at DESC
        ->  Hash Join  (cost=120.00..1225.00 rows=200 width=72)
              Hash Cond: (o.user_id = u.id)
              ->  Index Scan using idx_orders_status_created on orders o
                    Index Cond: (status = 'PENDING')
                    Filter: (created_at > (now() - '7 days'))
              ->  Hash  (cost=80.00..80.00 rows=3200 width=40)
                    ->  Seq Scan on users u
What to look for
| Node | Concern |
|---|---|
| Seq Scan on large table | Missing index — check if WHERE/JOIN columns are indexed |
| Hash Join | Good for large tables, bad if one side is huge (memory spill) |
| Nested Loop | Good if inner side is small; catastrophic if both sides are large |
| Sort without index | Expensive for large datasets; consider index on ORDER BY column |
| High rows estimate vs actual | Stale statistics — run ANALYZE |
| Filter after index scan | Index finds some rows, but rows fail the additional filter — consider a composite index |
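When estimated costs look reasonable but the query is still slow, ask for I/O per node as well. In PostgreSQL, the BUFFERS option distinguishes pages served from cache (shared hit) from pages read off disk (read):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, u.name, o.total
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.status = 'PENDING';
```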
The N+1 Query Problem
The most common performance bug in web applications. You run 1 query to fetch a list, then N more queries to fetch related data for each row.
// Bad — N+1
const orders = await db.query('SELECT * FROM orders WHERE user_id = $1', [userId]);
// ↑ 1 query
for (const order of orders) {
order.items = await db.query(
'SELECT * FROM order_items WHERE order_id = $1', [order.id]
);
// ↑ N queries — 1 per order
}
// If there are 50 orders → 51 total queries
Fix 1: JOIN
SELECT o.id, o.total, oi.product_id, oi.qty, oi.price
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
WHERE o.user_id = $1;
One query. Combine in application code:
const rows = await db.query(joinQuery, [userId]);
const orders = groupBy(rows, 'id'); // group order_items by order id
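groupBy here is shorthand, not a library call — a minimal sketch of the reassembly step, assuming each joined row carries the order columns plus one item's columns (names follow the query above):

```javascript
// Fold flat JOIN rows back into one order object per id,
// each carrying a nested items array.
function groupOrders(rows) {
  const byId = new Map();
  for (const row of rows) {
    if (!byId.has(row.id)) {
      byId.set(row.id, { id: row.id, total: row.total, items: [] });
    }
    byId.get(row.id).items.push({
      product_id: row.product_id,
      qty: row.qty,
      price: row.price,
    });
  }
  return [...byId.values()];
}
```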
Fix 2: Two queries with IN clause (batch fetch)
Better when the JOIN produces too many rows (many-to-many, large result sets).
// Query 1: get orders
const orders = await db.query('SELECT * FROM orders WHERE user_id = $1', [userId]);
const orderIds = orders.map(o => o.id);
// Query 2: get all items in one shot
const items = await db.query(
'SELECT * FROM order_items WHERE order_id = ANY($1)', [orderIds]
);
// Combine in memory
const itemsByOrderId = groupBy(items, 'order_id');
for (const order of orders) {
order.items = itemsByOrderId[order.id] || [];
}
2 queries instead of N+1. This is the DataLoader pattern (used in GraphQL).
Fix 3: ORM eager loading
// Sequelize
const orders = await Order.findAll({
where: { userId },
include: [{ model: OrderItem }], // generates a JOIN or separate batch query
});
// Prisma
const orders = await prisma.order.findMany({
where: { userId },
include: { items: true },
});
Pagination — Offset vs Cursor
Offset pagination — the hidden performance problem
SELECT * FROM orders ORDER BY created_at DESC LIMIT 20 OFFSET 10000;
The database must scan and discard 10,000 rows before returning 20. At page 500 of a 10M row table, this becomes catastrophically slow.
Also inconsistent: if a row is inserted or deleted between page requests, items skip or duplicate.
Cursor-based pagination — the right way
-- First page
SELECT * FROM orders
WHERE created_at < NOW()
ORDER BY created_at DESC
LIMIT 20;
-- Next page — use the last row's created_at as cursor
SELECT * FROM orders
WHERE created_at < '2024-01-15 10:30:00' -- cursor from last row
ORDER BY created_at DESC
LIMIT 20;
async function getOrders(cursor, limit = 20) {
const query = cursor
? 'SELECT * FROM orders WHERE created_at < $1 ORDER BY created_at DESC LIMIT $2'
: 'SELECT * FROM orders ORDER BY created_at DESC LIMIT $1';
const params = cursor ? [cursor, limit] : [limit];
const rows = await db.query(query, params);
return {
data: rows,
nextCursor: rows.length === limit
? rows[rows.length - 1].created_at
: null,
};
}
No OFFSET — the index on created_at jumps directly to the cursor position. Performance is identical whether you're on page 1 or page 10,000.
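One caveat: if two rows share the same created_at, a timestamp-only cursor can skip or duplicate rows at page boundaries. The usual fix is a composite cursor on (created_at, id), compared with a row value (`WHERE (created_at, id) < ($1, $2) ORDER BY created_at DESC, id DESC`) and handed to clients as an opaque token. A sketch of the token handling (the base64url encoding is an illustrative choice, not a standard):

```javascript
// Serialize (created_at, id) into an opaque cursor token and back.
function encodeCursor(row) {
  return Buffer.from(JSON.stringify([row.created_at, row.id])).toString('base64url');
}

function decodeCursor(token) {
  const [createdAt, id] = JSON.parse(Buffer.from(token, 'base64url').toString('utf8'));
  return { createdAt, id };
}
```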
Avoiding Common Slow Query Patterns
Functions on indexed columns break the index
-- Bad — YEAR() wraps the column, index on created_at is not used
WHERE YEAR(created_at) = 2024
-- Good — use a range that the index can satisfy
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
-- Bad — LOWER() wraps the column
WHERE LOWER(email) = 'user@example.com'
-- Good — create a functional index, or store email in lowercase
CREATE INDEX idx_users_lower_email ON users (LOWER(email));
Implicit type conversion
-- Bad — user_id is INT, '42' is a string → implicit cast, index skipped
WHERE user_id = '42'
-- Good — match the column type
WHERE user_id = 42
SELECT * fetches unnecessary data
-- Bad — fetches all columns including large text/blob columns
SELECT * FROM products WHERE category_id = 5;
-- Good — fetch only what you need
SELECT id, name, price, thumbnail_url FROM products WHERE category_id = 5;
Especially important when you have TEXT, JSONB, or BYTEA columns — SELECT * forces the database to read and transfer large values (in PostgreSQL, detoasting them) that the application never uses.
Unbounded queries
-- Bad — could return millions of rows
SELECT * FROM events WHERE user_id = 42;
-- Good — always add LIMIT for list queries
SELECT * FROM events WHERE user_id = 42 ORDER BY created_at DESC LIMIT 100;
COUNT(*) on large tables
-- Slow on large tables — full scan
SELECT COUNT(*) FROM orders;
-- Use an approximate count for display purposes (only as fresh as the last VACUUM/ANALYZE)
SELECT reltuples::BIGINT FROM pg_class WHERE relname = 'orders';
-- Or maintain a counter table
UPDATE counters SET value = value + 1 WHERE name = 'total_orders';
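If the counter must stay in sync without every code path remembering to bump it, a trigger can do the update on insert. A PostgreSQL sketch (the function name and counters row follow the example above; note that a single counter row becomes a write hotspot under heavy insert traffic):

```sql
CREATE OR REPLACE FUNCTION bump_order_count() RETURNS trigger AS $$
BEGIN
  UPDATE counters SET value = value + 1 WHERE name = 'total_orders';
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_count
AFTER INSERT ON orders
FOR EACH ROW EXECUTE FUNCTION bump_order_count();
```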
Query Optimization Techniques
Use EXISTS instead of COUNT for existence checks
-- Bad — counts every matching row just so the app can test count > 0
SELECT COUNT(*) FROM orders WHERE user_id = 42;
-- Good — stops at first match
SELECT EXISTS(SELECT 1 FROM orders WHERE user_id = 42);
Batch writes instead of row-by-row inserts
-- Bad — 1000 round trips
for each item:
INSERT INTO order_items (order_id, product_id, qty) VALUES (1, ?, ?);
-- Good — 1 round trip
INSERT INTO order_items (order_id, product_id, qty)
VALUES (1, 'P1', 2), (1, 'P2', 1), (1, 'P3', 5), ...;
// Node.js batch insert
const values = items.map((item, i) =>
`($1, $${i * 2 + 2}, $${i * 2 + 3})`
).join(', ');
const params = [orderId, ...items.flatMap(i => [i.productId, i.qty])];
await db.query(`INSERT INTO order_items (order_id, product_id, qty) VALUES ${values}`, params);
Use CTEs for readable multi-step queries
WITH
pending_orders AS (
SELECT * FROM orders WHERE status = 'PENDING' AND created_at < NOW() - INTERVAL '24 hours'
),
affected_users AS (
SELECT DISTINCT user_id FROM pending_orders
)
SELECT u.email, COUNT(po.id) as stuck_orders
FROM users u
JOIN affected_users au ON au.user_id = u.id
JOIN pending_orders po ON po.user_id = u.id
GROUP BY u.email;
UPSERT instead of SELECT + INSERT/UPDATE
-- Bad — two round trips, race condition between SELECT and INSERT
SELECT * FROM product_views WHERE product_id = $1 AND user_id = $2;
-- If not exists:
INSERT INTO product_views ...;
-- If exists:
UPDATE product_views SET count = count + 1 ...;
-- Good — atomic UPSERT
INSERT INTO product_views (product_id, user_id, count)
VALUES ($1, $2, 1)
ON CONFLICT (product_id, user_id)
DO UPDATE SET count = product_views.count + 1;
Connection Pooling
Database connections are expensive — each one consumes memory on the database server, and opening one costs a TCP handshake, authentication, often a TLS negotiation, and (in PostgreSQL) forking a new backend process. Never open a new connection per request.
// Bad — new connection per request
app.get('/orders', async (req, res) => {
const client = new pg.Client({ connectionString });
await client.connect();
const result = await client.query('SELECT * FROM orders');
await client.end();
res.json(result.rows);
});
// Good — shared pool
import { Pool } from 'pg';
const pool = new Pool({
connectionString: process.env.DATABASE_URL,
max: 20, // max connections in pool
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
app.get('/orders', async (req, res) => {
const result = await pool.query('SELECT * FROM orders');
res.json(result.rows);
});
Pool sizing formula
pool_size = (Ncpu * 2) + effective_spindle_count (a widely cited starting point from the PostgreSQL wiki, popularized by the HikariCP docs; benchmark for your workload)
For a server with 4 CPUs and a single disk:
pool_size = 4 * 2 + 1 = 9
Too many connections hurt performance (context switching, memory). Most Postgres instances support 100–200 connections. Use PgBouncer as a connection pooler when you have many app servers.
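A minimal pgbouncer.ini in front of Postgres might look like this (values are illustrative; transaction pooling shares server connections between transactions, which rules out session-level features such as session-scoped prepared statements and advisory locks):

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_port = 6432
pool_mode = transaction   ; hand a server connection out per transaction
default_pool_size = 20    ; server connections per user/database pair
max_client_conn = 1000    ; client connections PgBouncer will accept
```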
Slow Query Log
Find slow queries automatically.
PostgreSQL:
-- Log queries slower than 100ms
ALTER SYSTEM SET log_min_duration_statement = 100;
SELECT pg_reload_conf();
-- Or per session
SET log_min_duration_statement = 100;
MySQL:
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0.1; -- 100ms
Use tools like pgBadger (PostgreSQL) or pt-query-digest (MySQL) to analyze slow query logs.
Interview definition (short answer)
"Query optimization starts with EXPLAIN ANALYZE to understand the execution plan. The most common issues are: missing indexes, N+1 queries (fixed with JOINs or batch fetching), offset pagination (fixed with cursor-based), and functions wrapping indexed columns. Always use connection pooling, batch writes, and SELECT specific columns instead of SELECT *."